Self-configuring distributed switch

ABSTRACT

A method of interleaving time-critical data packets and delay-tolerant data packets on a shared channel emanating from a control port of a switching node permits a strict time requirement for transmission of time-critical data packets to be met. A control circuit of the switching node stores a local time and an indication of a time required to transfer a delay-tolerant data packet waiting to be transferred, and includes a comparator and a selector to control transfer of the time-critical and delay-tolerant data packets.

STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH OR DEVELOPMENT

[0001] This work was supported by the United States Government under Technology Investment Agreement TIA F30602-98-2-0194.

TECHNICAL FIELD

[0002] This invention relates generally to high-capacity data switches. In particular, it relates to a self-configuring distributed switch with a channel-switched core which automatically adapts to varying data traffic loads in a switched data network, and has a very high switching capacity.

BACKGROUND OF THE INVENTION

[0003] The volume of data now exchanged through telecommunications networks requires data networks having a high data transfer capacity. Such networks must also serve large geographical areas. Network scalability to achieve very high capacity and wide-area coverage may be realized by increasing the number of nodes in a network and/or increasing the transfer capacity per node. For a given link capacity, e.g., 10 Gb/s, increasing the capacity per node necessitates increasing the number of links per node. In a balanced network, the mean number of hops per node pair is inversely proportional to the number of links per node. Decreasing the mean number of hops per node pair dramatically reduces network-control complexity, facilitates the routing function, and enables the realization of network-wide quality of service (QOS) objectives.

[0004] In order to decrease the number of hops per node pair in a network, very high-capacity switches are required. Consequently, methods are required for constructing very high-capacity switches. It is also desirable that such switches be distributed to permit switch access modules to be located in proximity to data traffic sources.

[0005] Advances in optical switching technology have greatly facilitated the construction of high-capacity switches using optical space switches in the switch core. The principal problem encountered in constructing high-capacity switches, however, is the complexity of coordinating the transfer of data between ingress and egress, while permitting the creation of new paths between the ingress and the egress. Consequently, there exists a need for a method of increasing data transfer capacity while simplifying data transfer control in a high-speed data switch.

[0006] The design of data switching systems has been extensively reported in the literature. Several design alternatives have been described. Switches of moderate capacity are preferably based on a common-buffer design. For higher capacity switches, the buffer-space-buffer switch and the linked-buffers switch have gained widespread acceptance. A switch based on an optical space-switched core is described in U.S. Pat. No. 5,475,679 which issued on Dec. 12, 1995 to Munter. An optical-core switching system is described in U.S. Pat. No. 5,575,320 which issued May 19, 1998 to Watanabe et al.

[0007] A buffer-space-buffer switch, also called a space-core switch, typically consists of a memoryless fabric connecting a number of ingress modules to a number of egress modules. The ingress and egress modules are usually physically paired, and an ingress/egress module pair often shares a common payload memory. An ingress/egress module pair that shares a common payload memory is hereafter referred to as an edge module. The passive memoryless fabric is preferably adapted to permit reconfiguration of the inlet-outlet paths within a predefined transient time. The memoryless core is completely unaware of the content of data streams that it switches. The core reconfiguration is effected by either a centralized or a distributed controller in response to spatial and temporal fluctuations in the traffic loads at the ingress modules.

[0008] The linked-buffers architecture includes module sets of electronic ingress modules, middle modules, and egress modules, and has been described extensively in the prior art. Each module is adapted to store data packets and forward the packets toward their respective destinations. The module-sets are connected in parallel using internal links of fixed capacity.

[0009] The control function for the linked-buffers switch is much simpler than the control function for the space-core switch. The capacity of the linked-buffers switch is limited by the capacity of each module-set, the number of internal links emanating from each ingress module, and the number of internal links terminating at each egress module. With a given module-set capacity, the capacity of a linked-buffers switch can be increased virtually indefinitely by increasing the number of internal links, which permits the number of module-sets in the switch to be accordingly increased. However, with a fixed module capacity, when the number of internal links is increased, the capacity of each internal link must be correspondingly reduced. Reducing the capacity of an internal link is not desirable because it limits the capacity that can be allocated to a given connection or a stream of connections. A switch with a space switch core does not suffer from this limitation.

[0010] The linked-buffers switch can be modified in a known way by replacing a module-set with a single module having a higher capacity than that of any of the modules in the module set. As described above, a module set includes an ingress module, a middle module, and an egress module. The modified configuration, hereafter referred to as a mesh switch, enables both direct switching from ingress to egress and tandem switching.

[0011] A disadvantage of the switching architectures described above is their limited scalability.

[0012] Prior art switches may be classified as channel switches that switch channels without examining the content of any channel, and content-aware data switches. A switched channel network has a coarse granularity. In switched data networks inter-nodal links have fixed capacities. Consequently, fluctuations in traffic loads can require excessive tandem switching loads that can reduce the throughput and affect network performance.

[0013] A need therefore exists for a self-configuring data switch that can adapt to fluctuations in data traffic loads.

OBJECTS OF THE INVENTION

[0014] It is therefore an object of the invention to provide a very high-capacity switch with a channel-switching core.

[0015] It is another object of the invention to provide an architecture for an expandable channel-switching core.

[0016] It is yet another object of the invention to provide a self-configuring switch that adjusts its internal module-pair capacity in response to fluctuations in data traffic volumes.

[0017] It is a further object of the invention to provide a data switch that implements both direct channel paths and tandem channel paths.

[0018] It is yet a further object of the invention to provide a data switch in which channel switching and connection routing are fully coordinated.

[0019] It is a further object of the invention to provide a method and an apparatus for time coordination of connection routing and path reconfiguration.

[0020] It is a further object of the invention to provide a method of interleaving time-critical data and delay-tolerant data on a shared transmission medium.

[0021] It is a further object of the invention to provide a method of assigning inter-module paths so as to maximize the use of direct ingress/egress data transfer.

SUMMARY OF THE INVENTION

[0022] The invention provides a self-configuring data switch comprising a number of electronic switch modules interconnected by a single-stage channel switch. The single-stage channel switch comprises a number P of parallel space switches each having n input ports and n output ports. Each of the electronic modules is preferably capable of switching variable-size packets and is connected to the set of P parallel space switches by W channels, W≦P. A channel may be associated with a single wavelength in one of M multiple wavelength fiber links, where W/M is a positive integer. The maximum number of modules is the integer part of n×P/W. The capacity of each module may vary from a few gigabits per second (Gb/s) to a few terabits per second (Tb/s). The module capacity is shared between the core access links and the outer access links which are connected to data traffic sources and data traffic sinks, or other data switches.
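The dimensioning rule stated above can be illustrated with a short sketch. The function name max_modules and its argument checks are illustrative assumptions and not part of the specification; only the relations W≦P, W/M being a positive integer, and the integer part of n×P/W are taken from the text.

```python
def max_modules(n: int, P: int, W: int, M: int = 1) -> int:
    """Return the integer part of n*P/W, the maximum number of modules.

    n: input (and output) ports per space switch
    P: number of parallel space switches
    W: channels connecting each module to the core (W <= P)
    M: number of multi-wavelength fiber links per module (W/M must be an integer)
    """
    if min(n, P, W, M) < 1:
        raise ValueError("n, P, W and M must be positive integers")
    if W > P:
        raise ValueError("W must not exceed P")
    if W % M != 0:
        raise ValueError("W/M must be a positive integer")
    return (n * P) // W


# Four-port space switches, four channels per module:
print(max_modules(n=4, P=4, W=4))    # fully-connected core
print(max_modules(n=4, P=8, W=4))    # twice as many space switches
print(max_modules(n=4, P=16, W=4))   # four times as many space switches
```

With n and W held fixed, the module count grows in proportion to P, which is the sense in which a core with more space switches than channels per module offers a larger capacity.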

[0023] The channel switch core permits any two modules to be connected by an integer number of channels. A channel has a predefined capacity, typically several Gb/s. In order to enable the switching of traffic streams at arbitrary transfer rates, the inter-module connection pattern is changed in response to fluctuations in data traffic load. However, it may not be possible to adaptively modify the paths between modules to accommodate all data traffic variations, and it may be uneconomical to establish under-utilized paths for node-pairs of low traffic. To overcome this difficulty, a portion of the data traffic flowing between a source module and a sink module may be switched through one or more intermediate modules. Thus, in effect, the switch functions as a hybrid of a channel switch and a linked-buffers data switch, benefiting from the elastic path capacity of the channel switch and the ease of control of the linked-buffers data switch.

[0024] Changes to the channel switch connectivity are preferably computed by a global controller which determines changes in the input-output configurations of some space switches. The reconfiguration may be implemented in each of the P space switches. To realize a smooth reconfiguration, it is preferable that the connectivity changes be implemented in one space switch at a time. The global controller ensures that one-to-one mapping, or one-to-many mapping, of the channels is preserved in order to avoid collision. A collision results from many-to-one mapping.

[0025] The switching modules need not be collocated with each other or with the space switch core. Consequently, the respective lengths of the links between the switching modules and the switch core may vary significantly. Hence, a timing mechanism is needed to coordinate the reconfiguration of the inter-module paths to ensure that data is not lost during reconfiguration. The timing mechanism is distributed. One of the modules is collocated with the channel switch core and hosts a global controller. The other switch modules may be located at any desired distance from the channel switch core. Each of the modules operates a local cyclical time counter of a predetermined period. Each time the local counter reaches zero, the module sends a timing packet to the global controller. On receipt of a timing packet, the global controller time-stamps the packet and places it in a transmit queue from which it is transferred back to its respective module. On receipt of the returned stamped timing packet, a module extracts the time-stamp information and uses it to adjust its time counter at an appropriate time. This coordinates the local time counter with the global time counter to enable switch reconfigurations with a minimal guard time. The guard time is also needed to compensate for transient periods in the channel switch during reconfiguration.

BRIEF DESCRIPTION OF THE DRAWINGS

[0026] Further features and advantages of the present invention will become apparent from the following detailed description, taken in combination with the appended drawings, in which:

[0027] FIG. 1a is a schematic diagram of a hybrid switch comprising a channel switch and a data switch interconnecting a bank of electronic modules;

[0028] FIG. 1b is a schematic diagram of a hybrid switch functionally equivalent to the hybrid switch of FIG. 1a with the edge modules performing the data-switching function;

[0029] FIG. 2 is a schematic diagram of a switch having a bank of edge modules interconnected by a fully-connected core comprising a bank of space switches;

[0030] FIG. 3a is a schematic diagram of a partially-connected space switch core having double the capacity of a corresponding switch with a fully-connected core shown in FIG. 2;

[0031] FIG. 3b is a simplified representation of the partially-connected core of FIG. 3a, showing the wavelength assignment in a wavelength division multiplexed (WDM) core;

[0032] FIG. 3c is a schematic diagram of a partially-connected space switch which is a mirror-image of the switch of FIG. 3a;

[0033] FIG. 4 is a simplified representation of a partially-connected core of four times the capacity of a corresponding fully-connected core constructed with the same space switches, the wavelength assignment in a WDM implementation being indicated;

[0034] FIG. 5a shows the connectivity of a partially-connected core with reference to one of the space switches;

[0035] FIG. 5b shows the connectivity of a partially-connected core with reference to one of the space switches, the core connectivity being a mirror-image of the connectivity related to FIG. 5a;

[0036] FIG. 6 is a schematic diagram of a hybrid distributed switch showing the control elements;

[0037] FIG. 7 shows the connection of a module hosting a global controller to the partially connected switch core shown in FIG. 3a;

[0038] FIG. 8 illustrates a data structure used for connection routing in a switch with a fully-connected core;

[0039] FIG. 9 illustrates a data structure used for connection routing in a switch with a partially-connected core;

[0040] FIG. 10 is a schematic diagram of a connection request queue used in the global controller for processing connection requests from subtending data sources;

[0041] FIG. 11 is a schematic diagram of a progress queue used by the global controller to track connections in progress;

[0042] FIGS. 12a-d illustrate the process of switch core reconfiguration;

[0043] FIGS. 13a and 13b illustrate the process of time-indication alignment at the edge modules;

[0044] FIGS. 14a and 14b illustrate the phase discrepancy between a global timing counter and a timing counter associated with a module, with both counters being up-counters;

[0045] FIGS. 15a and 15b illustrate the phase discrepancy between a global timing counter and a timing counter associated with a module, with both counters being down-counters;

[0046] FIG. 16a illustrates misalignment of a global timing counter and a timing counter associated with a module where the former is an up-counter and the latter is a down-counter;

[0047] FIG. 16b illustrates the alignment of a global timing counter and a timing counter associated with a module, the former being an up-counter and the latter being a down-counter; and

[0048] FIG. 17 is a schematic diagram of a control circuit used at an egress port for a control channel connecting each module to a module that hosts a global controller for a distributed switch. It will be noted that throughout the appended drawings, like features are identified by like reference numerals.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT

[0049] Definitions

[0050] (1) Source module and sink module: With respect to a given connection between a traffic source and a traffic sink, a source module is the module supporting the traffic source and a sink module is the module supporting the traffic sink.

[0051] (2) Link: A physical transmission medium between a signal transmitter and a receiver; for example, an optical fiber.

[0052] (3) Channel: A designated allotment of the capacity of a link between a signal transmitter and a receiver; for example, a wavelength in a wavelength division multiplexed (WDM) optical fiber.

[0053] (4) Path: Two or more concatenated channels form a path.

[0054] (5) Connection: A reserved portion of a path.

[0055] (6) Connection routing: The process of selecting a path between a source module and a sink module.

[0056] (7) Channel assignment: The process of selecting the channels to form a path.

[0057] (8) Multiplex: A number of channels multiplexed in at least one transmission medium.

[0058] (9) Incoming multiplex: A multiplex arriving at a switching device.

[0059] (10) Outgoing multiplex: A multiplex emanating from a switching device.

[0060] (11) Reconfiguration guard time: A time interval during which no data is transmitted over a connection in order to account for transient periods during a reconfiguration of connections.

[0061] (12) Ingress port: Port of a switching module receiving data from subtending data traffic sources.

[0062] (13) Egress port: Port of a switching module transmitting data to subordinate data traffic sinks.

[0063] (14) Core-input channel: A channel from a switching module to a switch core.

[0064] (15) Core-output channel: A channel from a switch core to a switching module.

[0065] (16) Module-pair capacity: In a directly interconnected module pair, the lesser of a sending capacity of a first module and a receiving capacity of a second module in the pair.

[0066] (17) Fully-connected module-pair: A directly connected module pair which is connected by a set of paths having a combined capacity equal to the module-pair capacity. The paths may be shared by other module pairs.

[0067] (18) Partially-connected module-pair: A directly connected module pair connected by a set of paths having a combined capacity which is less than the module-pair capacity.

[0068] (19) Fully-connected switch: A switch in which all module pairs are fully connected. In a fully-connected switch, the paths connecting any given module pair may be congested under certain traffic conditions.

[0069] (20) Partially-connected switch: A switch in which some module pairs are partially-connected pairs.

[0070] (21) Normalized traffic unit: A dimensionless traffic unit defined as the data rate divided by a channel capacity. The data rate and the channel capacity are normally expressed in bits per second.

[0071] (22) Clock period: A time interval between successive clock pulses.

[0072] (23) Time-counter period: A period D of a digital counter used to indicate time. The period D is less than or equal to 2^C, C being a word length of the counter.

[0073] (24) Data switch: A data switch receives data from a number of incoming channels, identifies predefined data units, and directs the data units to respective outgoing channels. Data switches include telephony switches, frame-relay switches, ATM switches, and IP routers. In a network based on data switches, the inter-switch channel allocation is fixed.

[0074] (25) Channel switch: A memoryless switch that connects any of a number of incoming channels to any of an equal number of outgoing channels without examining the data content of any channel. The interconnection may be effected by a bank of space switches, and the interconnection pattern may be modified. However, an interval between successive interconnection reconfigurations is preferably much longer than a mean transfer time of a data unit. For example, in a data packet switch, the mean packet transfer time may be of the order of 100 nsec while the mean channel switching period would be of the order of a few milliseconds. In a network based on channel switches, the inter-module channel allocations are time-variant. End-to-end paths whose capacities match the respective end-to-end data traffic are formed by rearranging the connectivity of the channels.

[0075] (26) Data traffic routing: Data traffic routing is the process of directing an identifiable data unit or a stream of such units to a path selected from a set of two or more paths. The path is predefined and may comprise a number of concatenated channels, each channel having a defined point of origin and a defined destination.

[0076] (27) Module-state matrix: A 2×N matrix, where N is the number of modules. Entry (0, j), 0≦j<N, stores the combined available vacancy on all channels from a module j to the channel-switching core and entry (1, j), 0≦j<N, stores the combined available vacancy on all channels from the channel-switching core to the module j.

[0077] (28) Channel-vacancy matrix: A matrix having a number of columns equal to the number of incoming multiplexes and a number of rows equal to the number of space switches. Each entry in the matrix is initialized by a number representative of the capacity of a channel and dynamically stores a respective available capacity.

[0078] (29) Vacancy matching process: A first channel-vacancy matrix and a second channel-vacancy matrix are compared to determine the lesser of two corresponding entries. The first matrix stores the available capacity in each channel from an incoming multiplex to the channel switch core. The second matrix stores the available capacity in each channel from the channel switch core to an outgoing multiplex. Comparing two columns of the first and second matrices determines the available capacity between respective incoming and outgoing multiplexes.
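Definitions (28) and (29) can be illustrated with a minimal sketch. The list-of-rows layout, the function name match_vacancy, and the sample values are assumptions made for illustration only; capacities are expressed in normalized traffic units per definition (21), so a fully vacant channel holds 1.0. Only the per-switch comparison (the lesser of two corresponding entries) is taken from the definition.

```python
from typing import List

Matrix = List[List[float]]  # rows: space switches, columns: multiplexes


def match_vacancy(in_vacancy: Matrix, out_vacancy: Matrix,
                  j: int, k: int) -> List[float]:
    """Compare column j of the first matrix with column k of the second.

    in_vacancy[s][j]:  vacancy on the channel from incoming multiplex j
                       to space switch s.
    out_vacancy[s][k]: vacancy on the channel from space switch s
                       to outgoing multiplex k.
    Returns, for each space switch s, the lesser of the two entries.
    """
    if len(in_vacancy) != len(out_vacancy):
        raise ValueError("both matrices need one row per space switch")
    return [min(row_in[j], row_out[k])
            for row_in, row_out in zip(in_vacancy, out_vacancy)]


# Three space switches, two incoming and two outgoing multiplexes
# (hypothetical values).
in_vacancy = [[1.0, 0.5],
              [0.2, 1.0],
              [0.0, 0.7]]
out_vacancy = [[0.4, 1.0],
               [1.0, 0.1],
               [0.9, 0.3]]

per_switch = match_vacancy(in_vacancy, out_vacancy, j=0, k=1)
print(per_switch)  # vacancy usable through each space switch
```

Each element of the result bounds the capacity that one space switch could carry from incoming multiplex 0 to outgoing multiplex 1 under these sample values.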

[0079] The present invention provides a hybrid switch that combines the benefits of a channel switch with the benefits of a data switch. In a self-configuring switch in accordance with the invention, the control system enables the creation of inter-module paths, and controls the routing of connections to existing or new paths. The path configuration is changed slowly, in milliseconds for example, thus providing the switch control system sufficient time to compute required core path reconfigurations.

[0080] FIG. 1a is a schematic diagram of a hybrid switch in accordance with the invention which includes N electronic modules 84, a channel switch 82 and a dedicated data switch 83 which switches only tandem connections. Each module 84 receives data traffic from subtending traffic sources through incoming feeder links 86 and delivers data traffic destined to subordinate sinks through outgoing feeder links 87. Local subtending data traffic is switched directly to subordinate sinks through each module 84 as indicated by the dashed line 85. Each module 84 receives W incoming channels 92 from the channel switch 82, and sends W channels 93 to the channel switch 82. Each module 84 also receives B channels 96 from the data switch 83 and sends B channels 97 to the data switch 83.

[0081] FIG. 1b is a schematic diagram of a hybrid switch similar to that shown in FIG. 1a, except that the data switch 83 is eliminated and tandem data switching is performed at the edge modules as indicated by the dashed line 88 of FIG. 1b. The configuration of FIG. 1b enables higher efficiency than that of FIG. 1a due to the sharing of channels 92 and 93 by direct traffic and tandem switched traffic.

[0082] High-Capacity Core

[0083] The capacity of a switch based on a space switch core augmented by tandem switching can be expanded to a high capacity because the space switch connectivity requires reconfiguration less frequently, if complemented by tandem switching. The capacity of the space switch itself is, however, a limiting factor. Further capacity growth can be realized using a parallel arrangement of space switches. Using optical space switches, and with wavelength-division multiplexing, the parallel space switches may operate at different wavelengths in a manner well known in the art.

[0084] FIG. 2 is a schematic diagram of a wavelength-multiplexed switch 100 in accordance with the invention having a known configuration of a wavelength-multiplexed space switch core 82. The space switch core 82 includes a bank of W identical (n×n) space switches 102, each space switch 102 having n inputs and n outputs, n>1, each input being a channel of a predefined wavelength. All inputs to a given space switch 102 are of the same wavelength. Demultiplexer 104 separates the multiplexed channels in incoming multiplex 94 into individual channels 112, which are routed to different space switches 102 according to their respective wavelengths. The switched channels at the output of each space switch 102 are connected to multiplexers 106 and the multiplexed switched channels are grouped in at least one outgoing multiplex 96 and returned to the ingress/egress modules 84. The input-output connection pattern for each space switch 102 is determined by a global controller that is described below in more detail.

[0085] The capacity of the switch 100 is limited by the capacity of each of the space switches 102 and the number of channels in each incoming multiplex 94. The number of core-output channels grouped in at least one outgoing multiplex 96 is preferably equal to the number of core-input channels grouped in at least one incoming multiplex 94.

[0086] FIG. 3a shows an example of a wavelength multiplexed space switch core 120 with a number of space switches 102 larger than the number of channels in an incoming multiplex 94. In this example, each incoming multiplex comprises four channels and the demultiplexed channels are routed to four inner links 122. Sets of four inner links 124 are wavelength multiplexed onto outgoing multiplexes 96. The incoming multiplexes are divided into two groups labelled “A:0” and “A:1”. The outgoing multiplexes are divided into two groups labelled “B:0” and “B:1”. The channels of an incoming multiplex are divided as shown so that some channels are routed to outgoing multiplex “B:0” and the remaining channels are routed to outgoing multiplex “B:1”. With equal group sizes, and with even division of the internal channels 122, the maximum number of channels that an incoming multiplex can switch to a given outgoing multiplex equals the number of channels of the incoming multiplex divided by the number of groups of outgoing multiplexes. The numerals shown in space switches 102 represent the respective wavelengths they switch. The pattern of wavelengths switched by the lower group of the space switches 102 is a shifted pattern of the wavelengths switched by the space switches 102 in the upper group.

[0087] If the channels of an incoming multiplex are wavelength multiplexed, each space switch 102 is associated with a wavelength and the space switches are arranged according to their designated wavelength in such a way as to avoid the duplication of any wavelength in any outgoing multiplex.

[0088] FIG. 3b is a simplified representation of the configuration shown in FIG. 3a.

[0089] FIG. 3c shows the same example described with reference to FIG. 3a, except that the connection pattern between the input demultiplexers and the space switches and the connection pattern between the space switches and the output multiplexers are reversed. The space switch cores shown in FIGS. 3a and 3c are functionally equivalent.

[0090] FIG. 4 illustrates a configuration in which the number of edge modules is four times the number of input ports for each space switch 102. In this configuration, the edge modules are logically divided into four groups. The space switches 102 are also logically divided into four groups, as are the outgoing multiplexes. Each edge module routes only a quarter of its channels through the space switch core to any group. Likewise, each module can route at most one quarter of its channels through the core to any other edge module. Greater inter-module connectivity is realized through tandem switching. The space switches in each group are arranged in a shifted pattern in accordance with the wavelengths they switch. The channels of each incoming multiplex are distributed equally among the four groups of space switches. FIG. 4 shows 16 space switches 102 divided into four groups. The respective groups switch wavelengths {0, 1, 2, 3}, {1, 2, 3, 0}, {2, 3, 0, 1}, and {3, 0, 1, 2}. The incoming multiplexes are divided into four groups labelled A:0 through A:3, and each group includes four incoming multiplexes. The outgoing multiplexes are divided into four groups labelled B:0 through B:3, and each group includes four outgoing multiplexes. Each group of space switches is directly associated with a group of outgoing multiplexes. Each incoming multiplex has four channels. The channels of each incoming multiplex in group A:0 are assigned to corresponding space switches 102 in the four groups of space switches. For example, the four channels of an incoming multiplex belonging to group A:0 are assigned to the first space switch in each of the space switch groups B:0 through B:3. As is apparent, this arrangement of the core ensures that there is no duplication of a wavelength in any outgoing multiplex.
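The shifted wavelength pattern listed above can be reproduced with a short sketch. The closed-form rule used here, in which the space switch at position p of group g switches wavelength (p + g) modulo W, is an inference from the four patterns given in the text and is not stated in the specification; the function names are likewise illustrative.

```python
from typing import List


def switch_wavelength(S: int, W: int) -> int:
    """Wavelength switched by space switch S, assuming W switches per group
    and a cyclic shift of one position per group (inferred rule)."""
    group, position = divmod(S, W)
    return (position + group) % W


def group_patterns(G: int, W: int) -> List[List[int]]:
    """Wavelengths switched by each of G groups of W space switches."""
    return [[switch_wavelength(g * W + p, W) for p in range(W)]
            for g in range(G)]


patterns = group_patterns(G=4, W=4)
for g, row in enumerate(patterns):
    print(f"group {g}: {row}")

# Every group carries each wavelength exactly once, so an outgoing
# multiplex fed by one group never receives a duplicated wavelength.
assert all(sorted(row) == list(range(4)) for row in patterns)
```

Under the same assumed rule, space switch number 6 in a two-group core with W = 4 switches wavelength 3.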

[0091] FIG. 5a depicts the connection pattern for any channel switch assembled with G groups, G>0, numbered 0 to G-1. Each incoming and outgoing channel is identified by a group number, a relative multiplex number within the group, and a channel number. There are G×W space switches, numbered sequentially from 0 to G×W−1. FIG. 5a relates to space switch number S which is associated with wavelength Λ. A link 122 from group number [S/G]_G (the ratio S/G modulo G), multiplex number m, and a channel corresponding to wavelength Λ connects to input port m of space switch S, 0≦m<n, 0≦S<G×W. An output port m of switch S is connected by link 114 to a channel corresponding to wavelength Λ, in multiplex m, in group └S/W┘, where └u┘ denotes the integer part of a real number u. For example, in FIG. 3a, wavelength number 3 in multiplex 3 in group 0 is connected to input port number 3 in space switch number 6, while output port number 3 in space switch number 6 is connected by a link 124 to wavelength number 3 in outgoing multiplex number 3 in group 1.

[0092] FIG. 5b shows the connection pattern for a channel switch core with a connectivity that is a mirror image of the connectivity of the channel switch core represented in FIG. 5a.

[0093] Control Mechanism

[0094] As described above, in a hybrid switch in accordance with the invention, the channel switch core must be controlled to reconfigure in response to changes in traffic loads. FIG. 6 illustrates a mechanism for channel assignment and switching-time coordination in a hybrid switch schematically shown in FIGS. 1a and 1b. Several electronic data switch modules 84 are interconnected through a wavelength-multiplexed channel switch core 82. At least one of the modules 84a is collocated with the channel switch core 82 and hosts a global controller 162 which includes a time counter circuit 164. Global controller 162 receives traffic load information from each local controller 172 of modules 84, including its host module 84a, and determines desirable configuration changes for the core using an algorithm that is described below. In addition, controller 162 determines a time at which each reconfiguration of the core must occur. The global controller 162 periodically reviews the configuration of the switch to determine whether reconfiguration of the core is required. In order to provide the global controller 162 with traffic volume and timing data, each module 84 must have at least one path routed to module 84a, which hosts the global controller 162.

[0095] The configuration or reconfiguration of the connectivity of each of the space switches in the wavelength multiplexed space switch core 82 must be coordinated with corresponding switching processes in the modules 84. The time counter circuit 164 associated with the global controller 162 includes a global clock and a time counter (not shown). A time counter circuit 174 in each module controller 172 of each module 84, 84a includes a module clock and a time counter, preferably having an identical period to that of the global clock in time counter circuit 164. The global controller 162 communicates with the modules 84 to maintain a measurable difference between a value of each time counter in a circuit 174 and the time counter in circuit 164. The propagation delay between the modules and the global controller 162 must be taken into account in determining a core reconfiguration schedule. Without precise coordination between the modules 84 and the space switch core 82, some connections may be forced to an idle state for relatively long periods of time to ensure that data is not lost during a switch core reconfiguration.

[0096] The host module 84a switches payload data traffic as well as control data traffic. The global controller 162 is preferably connected to only one ingress/egress port of host module 84a. The egress port of module 84 connected to the global controller 162 is hereafter referred to as the control port of the module 84. Each channel directed to the global controller 162 carries timing data, hereafter called type-1 data, and traffic-related or payload data, hereafter called type-2 data. Type-2 data is relatively delay-insensitive. The type-1 data must be transferred without delay, either according to a schedule or in response to a stimulus. At least one register stores the type-1 data and at least one buffer stores the type-2 data in each module 84. The traffic volume of the type-2 data is generally much greater than that of the type-1 data.

[0097] A selector enables data units from one of the buffers to egress at a given time. When a timing packet arrives, it must egress at a predefined time, and transfer control must be exercised to ensure that the transfer of a packet from a type-2 buffer does not interfere with the transfer of the type-1 data. A transfer control circuit associated with the control port enables egress of the two traffic types while ensuring adherence to the strict time requirement of the type-1 data, as will be explained below in detail with reference to FIG. 17.

[0098] FIG. 7 illustrates the channel connectivity from each incoming multiplex 94 to module 84a, which hosts the global controller 162. Each multiplex must provide at least one channel to module 84a in order to provide access to the global controller 162. The switch configuration shown in FIG. 3a is used in this example. An incoming multiplex and an outgoing multiplex connect each module to the space switch core. Each incoming multiplex 94 has one of its channels routed to one of two demultiplexers 202. A demultiplexer 202 is needed per group. The channel from an incoming multiplex 94 to a demultiplexer 202 carries control data units and payload data units. The control data units include both traffic load measurement data and timing data. Similarly, module 84a routes a channel to each outgoing multiplex 96.

[0099] Channel-Switch Reconfiguration

[0100] Each module has a fixed number W of one-way channels to the core, and it receives a fixed number, preferably equal to W, of one-way channels from the core. The former are hereafter called A-channels, and the latter are called B-channels. A path from a module X to a module Y is formed by joining an A-channel emanating from module X to a B-channel terminating on module Y. Connecting the A-channel to the B-channel takes place at a core space switch. The number of paths from any module to any other module can vary from zero to W. The process of changing the number of paths between two modules is a reconfiguration process which changes the connection pattern of module pairs. A route from a module X to another module Y may have one path or two concatenated paths joined at a module other than modules X or Y. The latter is referred to as a loop path. A larger number of concatenated paths may be used to form a route. However, this leads to undesirable control complexity.

[0101] If the core is not reconfigured to follow the spatial and temporal traffic variations, a high traffic load from a module X to a module Y may have to use parallel loop-path routes. A loop-path route may not be economical since it uses more transmission facilities and an extra step of data switching at a module 84, 84a. In addition, tandem switching in the loop path adds to delay jitter.

[0102] Reconfiguration of the core is performed concurrently with a connection-routing process. Two approaches may be adopted. The first, a passive approach, joins free A-channels to free B-channels without disturbing connections in progress. The second, an active approach, may rearrange some of the connections in progress in order to pack the A-channels and B-channels and hence increase the opportunity of having free A-channels and B-channels to create a larger number of new paths. Rearrangement of a connection to free a channel is subject to the timing coordination required in any reconfiguration. It is noted that freeing an A-channel of a path while keeping the B-channel unchanged is a preferred practice since it does not require pausing of data transfer at the source module after a new path is created.

[0103] It is emphasized that the objective of reconfiguration is to maximize the proportion of the inter-module traffic that can be routed directly without recourse to tandem switching in a loop path. However, connections from a module X to a module Y which collectively require a capacity that is much smaller than a channel capacity preferably use loop-path routes. Establishing a direct path in this case is wasteful unless the path can be quickly established and released, which may not be feasible. For example, a set of connections from a module X to a module Y collectively requiring a 100 Mb/s capacity in a switch core with a channel capacity of 10 Gb/s uses only 1% of a path capacity. If a core reconfiguration is performed every millisecond, the path from module X to module Y would be re-established for one reconfiguration period in every 100 milliseconds to yield a 100 Mb/s connection. This means that some traffic units arriving at module X may have to wait for 100 milliseconds before being sent to module Y. A delay of that magnitude is unacceptable, and a better solution is to use a loop path where the data traffic for the connections flows steadily through a tandem-switched loop path through one of the edge modules other than modules X or Y.
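The arithmetic of the foregoing example can be checked with a short sketch. The function name and its parameters are illustrative only and do not appear in the specification; the sketch assumes that a time-shared direct path is held for exactly one reconfiguration period per turn.

```python
from fractions import Fraction

def direct_path_share(required_bps, channel_bps, reconfig_period_ms):
    """Return (share of a path capacity, spacing in ms between turns).

    Hypothetical helper: assumes the direct path is held for one
    reconfiguration period out of every 1/share periods.
    """
    share = Fraction(required_bps, channel_bps)
    spacing_ms = reconfig_period_ms / share
    return share, spacing_ms

# 100 Mb/s of traffic on a 10 Gb/s channel, reconfiguration every 1 ms
share, spacing_ms = direct_path_share(100_000_000, 10_000_000_000, 1)
print(float(share), float(spacing_ms))
```

With the figures of the example, the share evaluates to 1/100 and the spacing to 100 milliseconds, which is the waiting time that motivates the use of a loop path.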

[0104] Path Formation

[0105] Any of the channels belonging to an incoming multiplex has fixed connectivity with a predetermined set of space switches. Those channels may be paired with channels from the predetermined set of space switches to the outgoing multiplexes. The paired channels form inter-module paths. In a WDM core, each incoming or outgoing multiplex connects to W space switches, W being the number of wavelengths (channels) in each multiplex.

[0106] A module pair may be connected by an integer number of paths, ranging from zero to the number of channels in a multiplex. During a switch reconfiguration period, the number of paths connecting a module pair may change, and new connections may be routed to existing or newly created paths. It is also possible to reroute an existing connection to another path in order to free a path used by the connection and thus facilitate the formation of new paths between other module pairs.

[0107] The channel assignment process will first be described for the fully-connected channel switch (G=1) shown in FIG. 2. FIG. 8 illustrates memory tables used in the channel assignment process in switch 100 shown in FIG. 2. The example shown is for four incoming multiplexes, each including eight channels. Two matrices, 242 and 244, are used to facilitate the assignment process. Matrix 242 stores indicators of the vacancies in incoming multiplexes and matrix 244 stores indicators of the vacancies in outgoing multiplexes. The symbols shown in FIG. 8 identify the channels of each multiplex. This is for illustration only; numeric values representative of the respective vacancies are used in an actual implementation of the assignment procedure. As shown, the four outgoing multiplexes 0, 1, 2, and 3 receive 0, 4, 2, and 1 channels, respectively, from incoming multiplex 0. The channel assignment process will be described below in more detail with reference to FIGS. 10 and 11. The outcome of the assignment process is stored in a matrix 246, each row of which corresponds to one of the space switches. Each entry in matrix 246 has a width of log₂n bits (rounded up to the nearest integer), n being the number of input ports in a space switch 102, and stores the identity of the output port of the same space switch 102 to which an input port is connected.

[0108] FIG. 9 illustrates the channel assignment process for an extended switch 140 shown in FIG. 4. In this example, there are 16 incoming multiplexes, each including 8 channels (W=8). The incoming multiplexes are divided into equal groups (G=4) labeled A:0, A:1, A:2, and A:3. The symbols used in the figure identify channels of corresponding incoming multiplexes in the four groups. A small space switch core is used here for ease of illustration. Typically, n=16, G=4, and W=128, i.e., N=G×n=64, leading to an inner capacity equal to N×W×R=8192 R. With R=10 Gb/s, this is 81.92 Tb/s, or roughly 80 Tb/s.
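The inner capacity quoted above follows directly from N=G×n and N×W×R. The sketch below merely evaluates those two expressions; the function name is illustrative and is not taken from the specification.

```python
def inner_capacity(n, G, W, R_bps):
    """Return (number of modules N, inner capacity in bit/s).

    n: ports per space switch, G: number of groups,
    W: channels per multiplex, R_bps: channel rate in bit/s.
    """
    N = n * G
    return N, N * W * R_bps

N, capacity_bps = inner_capacity(n=16, G=4, W=128, R_bps=10_000_000_000)
print(N, capacity_bps / 1e12, "Tb/s")
```

For the typical values given in the text, N is 64 and the product N×W is 8192 channels, so that a 10 Gb/s channel rate yields 81.92 Tb/s.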

[0109] A matrix 262 is used to indicate the unassigned capacity of input ports of a space-switch group to facilitate the channel assignment process. The four matrices 262 are represented separately for illustration only. The four matrices can be interleaved in a single 4×32 matrix. Each matrix 264 has 4×8 entries, each entry indicating the unassigned capacity in a respective output port of a respective space switch. The outcome of the assignment process is stored in a companion matrix 264 of 4×32 entries, each entry being log₂n bits wide (rounded up) and storing the identity of an output port to which the respective input port is to be connected.

[0110] Referring to FIG. 8 and FIG. 9, a matching operation involves a simple comparison of two corresponding entries, one in matrix 242 (262 in FIG. 9) and the other in matrix 244 (264 in FIG. 9), followed by a subtraction if a connection is assigned. (Recall that G denotes the number of groups, n the number of inputs or outputs per space switch, and W the number of channels per incoming or outgoing multiplex.) The channel switch 82 is fully connected if G=1, and partially connected if G>1. The number of modules is N=n×G. A fully connected channel switch 82 with N modules would require W space switches of N inputs and N outputs. The use of more than one group (G>1) reduces the complexity of the space switch design and reduces the matching effort, but full connectivity is sacrificed.
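By way of illustration only, the matching operation for the fully connected case (G=1) may be sketched as follows. The names `new_state` and `match_direct`, the list-of-lists layout, and the use of integer capacity units are assumptions made for this sketch. In particular, the check that an input port is either unconnected or already joined to the requested output port is one interpretation of how the connectivity matrix would constrain the comparison; the specification states only that two corresponding vacancy entries are compared and then debited.

```python
CHANNEL = 1 << 20  # capacity of one channel in integer units (assumed 20-bit scale)

def new_state(W, n):
    """Vacancy and connectivity tables for W space switches of n ports each."""
    in_vac = [[CHANNEL] * n for _ in range(W)]   # vacancies of incoming channels
    out_vac = [[CHANNEL] * n for _ in range(W)]  # vacancies of outgoing channels
    conn = [[None] * n for _ in range(W)]        # output port joined to each input
    return in_vac, out_vac, conn

def match_direct(state, src, sink, cap):
    """Return the index of the first space switch that can carry cap, else None."""
    in_vac, out_vac, conn = state
    for w in range(len(conn)):                   # scan space switches 0..W-1
        row = conn[w]
        joined = row[src] == sink
        joinable = row[src] is None and sink not in row
        if not (joined or joinable):
            continue
        # Compare the two corresponding entries, then subtract on success.
        if in_vac[w][src] >= cap and out_vac[w][sink] >= cap:
            in_vac[w][src] -= cap
            out_vac[w][sink] -= cap
            row[src] = sink
            return w
    return None

state = new_state(W=2, n=4)
quarter = CHANNEL // 4
print(match_direct(state, 0, 1, 3 * quarter))  # first request fits in switch 0
print(match_direct(state, 0, 1, 2 * quarter))  # too large for the remainder, uses switch 1
print(match_direct(state, 0, 2, quarter))      # both inputs already joined to module 1
```

In this small example the third request finds no direct path, which is the situation in which a loop path would be attempted.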

[0111] A partially-connected channel switch cannot serve as a switch core unless augmented with tandem loop-path switching to handle spatial traffic variations. Full connectivity of the channel switch may be necessary during periods of severe spatial imbalance in data traffic loads. With partial connectivity, the disparity of module-pair loads can lead to a significant proportion of the traffic being forced into loop paths.

[0112] Core Reconfiguration and Channel Assignment

[0113] As explained above, a connection is routed to a path between a source and a sink. A module 84 receiving a connection request from a subordinate traffic source (not shown) is a source module, and the module 84 hosting the sink (not shown) is a sink module. A direct path between a source module 84 and a sink module 84 comprises a channel from the source module to a space switch 102 in the switch core and a channel from the space switch 102 to the sink module. A tandem loop path between a source module and a sink module comprises two direct paths, one from the source module through a space switch in the core to an intermediate module, and one from the intermediate module through a space switch in the core to the sink module. The intermediate module is any module except the source and sink modules.

[0114] When a source module receives a connection request, it sends the request to the global controller 162 (FIG. 6). The global controller 162 routes connections to paths, and reconfigures the channel connections in the core as required to accommodate temporal and spatial fluctuations in traffic loads. Preferably, the connection routing process is performed periodically. The time between successive connection routing processes is preferably equal to a reconfiguration period. Connection requests received by each module 84, 84a from subtending traffic sources (not shown) are transferred to the global controller 162 for processing. Connection requests received by the global controller 162 from the modules during a reconfiguration period are preferably processed in a batch.

[0115] The channel assignment process includes the following steps:

[0116] (I) The global controller 162 maintains a 2×N module-state matrix (not shown) storing the free capacities of the N modules 84, 84a. One row of the matrix stores each module's available capacity on channels connecting the module to the core and the second row of the matrix stores the available capacity on channels connecting the core to each module.

[0117] (II) When a new connection request is sent from a module 84, 84a to the global controller 162, the sink module is identified. The corresponding entries in the module-state matrix are examined. If either entry is smaller than the connection capacity requested in the connection request, the connection request is placed in a standby queue (not shown). Otherwise, the connection request is entered in a connection request queue 280 shown in FIG. 10, and the entries in the 2×N module-state matrix are debited accordingly. Each entry in the connection request queue 280 includes three fields: a source module identifier 282, a sink module identifier 283, and a requested connection capacity 284. The standby queue has the same format as the connection request queue. The connection capacity requested is preferably represented as a fraction of a capacity of a channel. A 20-bit representation of the channel capacity, for example, permits an integer representation of each fraction with a relative accuracy within 1 per million. A request entered in the request queue may be accepted if an internal route can be found as described in the following steps:

[0118] (1) The request queue is sorted in a descending order according to capacity requirement before a matching process begins;

[0119] (2) An attempt is made to find a direct path from the source module to the sink module for each connection request in the request queue. This involves carrying out a matching process as described above. The matching process is implemented for each entry in the request queue starting with the highest requested connection capacity. A request for a high connection capacity has fewer matching opportunities than a request for a small connection capacity. Thus, the requests for higher connection capacities are preferably processed before the available channel capacity is assigned to requests for low-capacity connections.

[0120] Each time a connection request is successfully assigned, each of the corresponding entries in the channel-vacancy matrices (242, 244) or (262, 264) is decreased by the value of the assigned capacity.

[0121] Each successful connection is deleted from the request queue, assigned an internal connection number, and entered in a progress queue. The internal connection number is selected from a pool of K recycled connection numbers in a manner well understood in the art. If all the K connection numbers are assigned, processing the request queue is stopped for the reconfiguration period in progress and resumes in subsequent reconfiguration periods. The number K is the maximum number of connections that can be supported at any given time. This value is selected to be sufficiently large to render the event of a full progress queue improbable. A full progress queue results in delaying the processing of the request queue until a subsequent reconfiguration period.

[0122] The progress queue preferably has K columns and six rows, and a column is indexed by the internal connection number. The six rows in the progress queue (FIG. 11) are used to store the source module identifier 292, intermediate module identifier 293 (if any), sink module identifier 294, space switch identifier 295 in the first path, space switch identifier 296 in the second path (if any), and capacity assigned 297, respectively. The intermediate module and second space switch entries are null in the case of a direct path. The progress queue is stored in a memory accessed by the global controller 162. When this step is complete, the request queue contains only the requests that could not be routed via direct paths.

[0123] (3) For any requests remaining in the request queue, an attempt is made to find a loop path, which requires tandem switching at an intermediate module as described above. The remaining connection requests are processed sequentially. The process includes a step of finding a matching path from the source module to an intermediate module and a matching path from the intermediate module to the sink module.

[0124] (4) Each request that can be assigned a loop path is deleted from the request queue, assigned an internal connection number as described above, and entered in the progress queue. A column corresponding to a loop path in the progress queue includes the source module identifier 292, the intermediate module identifier 293, the sink module identifier 294, the first connecting space switch identifier 295, the second connecting space switch identifier 296, and the capacity assigned 297.

[0125] (5) The remaining requests in the request queue are rejected in the current reconfiguration cycle and the respective capacities 284 indicated in the request queue are credited in the 2×N module-state matrix (not shown) as described in step (I) above.

[0126] (6) If any request is rejected in step 5, the queue of standby requests is examined to determine if any standby request can exploit the vacancy created by the rejected request. Steps 1 to 5 are repeated, with the standby queue taking the place of the request queue. The standby queue is preferably sorted in a descending order according to the value of the requested connection capacity.

[0127] (7) When a connection is terminated, its assigned capacity is added to corresponding entries in the channel-vacancy matrices (242, 244) or (262, 264) and in the 2×N module-state matrix, and the connection number is returned to the pool of recycled connection numbers. Thus, the corresponding column in the progress queue becomes available for use by a new connection. Initially the free-capacity arrays store the total internal capacity of the respective modules. The channel-vacancy matrices are initialized to contain the capacity of a channel.

[0128] In order to increase the opportunity of accommodating future requests, the space switches should be scanned in a sequential order from 0 to W−1 in each matching attempt, and the intermediate modules in loop paths are attempted in a sequential order.
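The steps above can be sketched, under stated assumptions, for one batch of requests in a fully connected core (G=1). The class `Core`, the function `process_batch`, the tuple layouts, and the integer capacity scale are hypothetical names and choices made for this illustration. The sketch covers admission against the module-state matrix (step II), sorting (step 1), direct-path matching (step 2), loop-path matching (steps 3 and 4), and rejection with crediting (step 5). It omits the pool of K connection numbers, the re-examination of the standby queue (step 6), and connection termination (step 7), and it does not debit the module-state entries of an intermediate module, a point on which the description above is silent.

```python
CHANNEL = 1 << 20  # one channel capacity in integer units (assumed 20-bit scale)

class Core:
    """Channel-vacancy and connectivity state for W space switches of n ports."""
    def __init__(self, W, n):
        self.in_vac = [[CHANNEL] * n for _ in range(W)]
        self.out_vac = [[CHANNEL] * n for _ in range(W)]
        self.conn = [[None] * n for _ in range(W)]

    def find(self, src, dst, cap):
        """First space switch (scanned 0..W-1) able to carry cap, else None."""
        for w, row in enumerate(self.conn):
            usable = row[src] == dst or (row[src] is None and dst not in row)
            if usable and self.in_vac[w][src] >= cap and self.out_vac[w][dst] >= cap:
                return w
        return None

    def commit(self, w, src, dst, cap):
        self.in_vac[w][src] -= cap
        self.out_vac[w][dst] -= cap
        self.conn[w][src] = dst

def process_batch(core, module_state, requests, n_modules):
    """requests: list of (source, sink, capacity). Returns (progress, standby, rejected)."""
    queue, standby = [], []
    for src, sink, cap in requests:                  # step II: admission
        if module_state[0][src] < cap or module_state[1][sink] < cap:
            standby.append((src, sink, cap))
        else:
            module_state[0][src] -= cap
            module_state[1][sink] -= cap
            queue.append((src, sink, cap))
    queue.sort(key=lambda r: r[2], reverse=True)     # step 1: largest first
    progress, remaining = [], []
    for src, sink, cap in queue:                     # step 2: direct paths
        w = core.find(src, sink, cap)
        if w is None:
            remaining.append((src, sink, cap))
        else:
            core.commit(w, src, sink, cap)
            progress.append((src, None, sink, w, None, cap))
    rejected = []
    for src, sink, cap in remaining:                 # steps 3 and 4: loop paths
        for mid in range(n_modules):                 # intermediates in sequential order
            if mid in (src, sink):
                continue
            w1 = core.find(src, mid, cap)
            w2 = core.find(mid, sink, cap)
            if w1 is not None and w2 is not None:
                core.commit(w1, src, mid, cap)
                core.commit(w2, mid, sink, cap)
                progress.append((src, mid, sink, w1, w2, cap))
                break
        else:                                        # step 5: reject and credit
            module_state[0][src] += cap
            module_state[1][sink] += cap
            rejected.append((src, sink, cap))
    return progress, standby, rejected
```

As a usage example, with a single space switch (W=1), three modules, and requests (0, 1, one half), (0, 2, one quarter) and (0, 1, one full channel), the first request receives a direct path, the second is carried on a loop path through module 1 because the only A-channel of module 0 is already joined to module 1, and the third is placed in the standby queue at admission. Each entry of the returned progress list follows the six-row column layout of the progress queue described above.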

[0129] Centralized Switch Reconfiguration

[0130] In a centralized switch, edge modules are located in the vicinity of the space switch core and the propagation delay between each module 84 (FIG. 6) and the optical channel switch 82 may be sufficiently small to be contained within a relatively short reconfiguration guard time. The core reconfiguration process can be frequent, the constraint on the frequency being principally the speed of the global controller 162. The global controller 162 sends connection-change requests to all participating modules a given lead time prior to a reconfiguration target time T, the lead time being sufficient to permit each participating module to implement the required connection rearrangement by the target time.

[0131] Distributed Switch Reconfiguration

[0132] It is desirable that the modules 84 be located close to their traffic sources and not necessarily in the vicinity of the space switch core 82. Consequently, the propagation delay between a module 84 and the space switch core 82 may be of the order of a millisecond or so. An interval of the order of a millisecond is too long to be practically and economically contained in a guard time.

[0133] Two main requirements stem directly from the variance of the delay from the modules to the channel switch. The first is the need to align the local time counter 174 at each module 84 with the global time counter 164 at the global controller 162, which is used as a reference time. The time counter alignment must be based on the individual propagation delays from each module to the space switch core 82. The second is a restriction on connection reconfiguration to account for a propagation delay variation between the space switch core 82 and the sink module. The latter requires that a traffic connection re-routed from a loop path to either a direct route or another loop path pause for a predetermined interval of time in order to ensure that no data in transit can arrive at the destination module after the data transferred to the destination module via the new route. A transfer from a direct path to a loop path or another direct path does not result in out-of-sequence data blocks.

[0134] Selection of the Time Counter Period

[0135] (1) The period D of a time counter (164, 174) must be at least equal to the sum of a largest propagation delay between any module 84 and the global controller 162 and a time allowance sufficient for any module to implement a connection reconfiguration.
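A minimal sketch of this selection rule is given below. It assumes a counter of C bits driven by a clock of period δ, so that the counter period is δ×2^C. The function names and the delay figures in the usage lines are hypothetical; integer nanoseconds are used to avoid rounding effects.

```python
def min_period_ns(propagation_delays_ns, reconfig_allowance_ns):
    """Smallest admissible counter period D: largest delay plus the allowance."""
    return max(propagation_delays_ns) + reconfig_allowance_ns

def counter_width_bits(min_period, clock_period_ns):
    """Smallest C such that clock_period_ns * 2**C >= min_period."""
    ticks = -(-min_period // clock_period_ns)   # ceiling division
    return max(1, (ticks - 1).bit_length())

def counter_period_ns(width_bits, clock_period_ns):
    """Period of a C-bit counter with the given clock period."""
    return clock_period_ns << width_bits

# Hypothetical delays of three modules and a 0.1 ms reconfiguration allowance
D_min = min_period_ns([200_000, 1_000_000, 650_000], 100_000)
C = counter_width_bits(D_min, clock_period_ns=100)
print(D_min, C, counter_period_ns(C, 100))
```

For these assumed figures the minimum period is 1.1 ms and a 14-bit counter with a 100 ns clock suffices. A 20-bit counter with the same clock has a period of 104,857,600 ns, that is, about 100 ms.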

[0136] Timing Control

[0137] All clocks in the time counter circuits 164 and 174 (FIG. 6) are synchronized using techniques well known in the art. As noted above, time coordination is required to harmonize the switching function in the switch core 82 and the modules 84, 84a to ensure that no data units are lost during switch core reconfiguration.

[0138] FIGS. 12a-d illustrate the time coordination performed during the reconfiguration process. In this example, each space switch in the core has 16 input ports and 16 output ports. Arrays 306 and 308 (FIG. 12a) show the input-output connectivity of a given 16-port space switch core 102 before and after reconfiguration. In this example, global controller 162 (FIG. 6) has determined that the space switch core connectivity should be reconfigured so that ingress module 1 connects to egress module 11 instead of egress module 8, ingress module 6 connects to egress module 2 instead of egress module 12, etc. The new connectivity in the switch core is shown in FIG. 12b. The required changes are shown in underline and bold type in FIGS. 12a and 12b and include inputs 1, 6, 10 and 14. The reconfiguration also requires a change in transfer buffer pointers (not shown) at the ingress modules 1, 6, 10 and 14 so that data units for the new destination are output on the respective channels after the reconfiguration shown in FIG. 12d. As shown in FIGS. 12c and 12d, packets 312 are the last packets that the affected ingress modules transmit through the core prior to reconfiguration, and packets 314 are the first packets transmitted through the core after reconfiguration. The separation between packets 312 and 314 represents a guard time to account for a reconfiguration transition delay at the space switch core 82 (FIG. 6). The packet streams from inputs 1, 6, 10 and 14 are sent at their local times T and arrive at the space switch core at global time T, as determined by collocated global time counter 164, in accordance with a transmit time coordination method in accordance with the invention.

[0139] Timing packets are exchanged between each local time counter circuit 174 and the global time counter circuit 164 of the global controller 162. Each module controller 172 transmits a timing packet when its local time counter reaches zero. This may be performed each time the local time counter reaches zero, or after a predetermined number of cycles determined to be appropriate. All time counters have the same width of C bits, 20 bits for example, with a counter period of δ×2^C, δ being the clock period. The clock period is the same in all modules. For example, with C=20 and δ=100 nsec, the counter period is about 100 msec. The timing packet is sent to the global controller 162. Upon receipt of each timing packet, controller 162 stamps the packet according to the reading of global time counter 164. The stamped packets are queued and transmitted back to their source modules. The time counter at the source module is reset to zero when it reaches a value determined according to the time stamp in the returned timing packet. The method of determining the resetting time is described in detail below. By doing so, a packet transmitted at local time X at any module will always arrive at the core channel switch at global time X. Thus, when the global controller 162 determines that reconfiguration of one of the space switches 102 is desirable, it computes a desirable time T for effecting the reconfiguration and then sends the value T in a reconfiguration request packet to the affected modules, as illustrated in FIG. 13b. The reconfiguration request packet sent to a module also contains relevant data on the new connectivity of the space switch. Preferably, the reconfiguration request packets are sent at global time “0” and the reconfiguration target time T is specified as equal to D. The module then performs the necessary internal switchover of traffic streams when its local time counter is equal to time T.

[0140] FIG. 13a illustrates an exchange of timing packets between a module 84 and the global controller 162. A module 84 sends a timing packet at time t₁, as indicated on the time axis 324. The packet is received at the global timing circuit 164 at time t₂, t₂>t₁, as indicated on line 322. The value of t₁ need not be known to the global controller 162. The global controller 162 inserts the value t₂ in the timing packet and at some later instant returns the packet to the module 84. The module controller 172 is then aware of the values t₁ and t₂, and uses this information to adjust its local time counter 174. The time counters are cyclic and, as described above, t₁ may be zero for simplicity. Similarly, another module 84 transmits its timing packet to the global controller 162 at time x₁, as indicated on line 326. The timing packet is received at the global controller 162 at time x₂, as indicated on line 322. On receipt of the timing packet at time x₂, the global controller 162 time stamps the packet and returns it, as described above.

[0141] Time Coordination Process

[0142] FIG. 13b illustrates the time coordination process to enable paths to be reconfigured, necessitating changes in the core. The time coordination process requires that the global controller 162 issue a reconfiguration request packet that is multicast simultaneously to all participating modules. As indicated on line 322 of FIG. 13b, the global controller sends the reconfiguration request packets to two modules. The reconfiguration request packet includes the desired reconfiguration time T, in addition to the information on the new connectivity of the core. The local time T in the first module time counter circuit 174, as indicated on line 324, and the local time T in the second module time counter 174, as indicated on line 326, differ in accordance with their propagation delays to the global controller 162. When each module transmits a bit at its local time T, the respective bits from the modules simultaneously reach the channel switch core at the global time T.

[0143] If the modules 84 and the channel switch core 82 are co-located, time coordination using the process described above is unnecessary. In that case, the global controller 162 may broadcast reconfiguration packets to all modules before the reconfiguration target time T, permitting a predetermined interval for implementing the reconfiguration changes required at the affected modules.

[0144] The time coordination process may be implemented using different kinds of counters. The time coordination process using up-counters at the global time counter 164 and at module time counters 174 is illustrated in FIGS. 14a and 14b. The time coordination process using down-counters at the global time counter 164 and the module time counters 174 is illustrated in FIGS. 15a and 15b. The time coordination process using an up-counter at the global time counter 164 and down-counters at module time counters 174 is illustrated in FIGS. 16a and 16b.

[0145] FIG. 14a and FIG. 14b illustrate the time counter resetting process at the local time counter circuit 174 in each module 84 in response to packets echoed by the global time counter circuit 164. FIG. 14a shows the case where the local time counter in a circuit 174 is leading the time counter of global circuit 164 and FIG. 14b shows the opposite case.

[0146] In FIG. 14a, the output 342 of the global time counter (shown in dotted lines) in circuit 164 and the output 344 of a local time counter (shown in solid lines) in a circuit 174 are shown as a function of time. The output Y is time-shifted by the magnitude of the propagation delay between a given module 84 and the global controller 162. Line 344 represents the local time counter output as if the entire output were transmitted to the global controller 162. A zero phase difference is preferable; in the figure, the outputs 342 and 344 are synchronized but are not aligned. When the output of the local time counter is zero, the module sends a timing packet to the global controller, which responds by writing into the timing packet the value y (346) of its global time counter at the time of receipt and placing the timing packet in a response queue. When the timing packet is returned to the module, the module controller 172 resets its local time counter to zero when its output reaches a complement (D−y), where “y” equals the global time stamp inserted in the packet, referenced as 348 in FIGS. 14a and 14b, and “D” is the time counter period. If D is a power of 2, then the complement (D−y) is the 1s complement of y.
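The effect of the resetting rule for up-counters can be checked with a small tick-level model. The model below is a sketch under stated assumptions, not the circuit of FIG. 14a: time is counted in integer clock ticks, the global counter reads t mod D, the local counter initially reads (t + phase) mod D for an unknown phase, and the one-way delay from the module to the global controller is a constant number of ticks smaller than D. The function and parameter names are illustrative.

```python
def reset_instant(D, phase, delay):
    """Instant (mod D) at which the module resets its local up-counter to zero."""
    t_send = (-phase) % D            # local counter reads zero: timing packet sent
    y = (t_send + delay) % D         # global reading stamped on receipt
    target = (D - y) % D             # local reading at which the reset takes place
    return (target - phase) % D      # instant when the old local reading hits target

def local_reading(t, t_reset, D):
    """Local counter reading after the reset (counts up from zero at t_reset)."""
    return (t - t_reset) % D

def global_reading(t, D):
    return t % D

D, phase, delay = 1 << 10, 333, 57
t_reset = reset_instant(D, phase, delay)
for t in range(2 * D, 2 * D + 5):
    sent_at_local = local_reading(t, t_reset, D)
    arrives_at_global = global_reading(t + delay, D)
    print(sent_at_local, arrives_at_global)
```

In this model each printed pair shows equal values, so that a packet sent when the local counter reads X reaches the global controller when the global counter reads X, irrespective of the initial phase. Because the counters are cyclic, the reset may take place in any later cycle with the same result.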

[0147] Similarly, the two counters may be down-counters, as illustrated in FIG. 15a and FIG. 15b.

[0148] Preferably, the time indicator at the global controller is an up-counter of C bits and the time indicator at each module is a down-counter of C bits, the time counter period D being 2^C times the clock period. When a module receives a stamped timing packet, it resets its down-counter by resetting each of its C bits to “1”. This is illustrated in FIG. 16a and FIG. 16b. Perfect alignment results, as shown in the pattern illustrated in FIG. 16b.

[0149] Interleaving of Time-Critical and Delay-Tolerant Signals

[0150] As described above, each module 84 has at least one channel, called the control channel, connected to module 84a hosting the global controller 162, as shown in FIG. 6. The egress port connecting the channel is hereafter called the control port of the module. The control channel carries timing packets, other control packets, and payload data. When a timing packet arrives at a type-1 buffer, it must egress at a predefined instant, yet the transfer of a packet from a type-2 buffer may be in progress at that instant. A circuit 380 shown in FIG. 17, associated with the control port, enables egress of the two traffic types while meeting the strict time requirement for transmission of the type-1 data.

[0151] Timing packets are type-1 data, while all other data can tolerate some jitter and is classified as type-2 data. At least one buffer 384 stores packets of type-1 data and at least one buffer 382 stores packets of type-2 data. The traffic volume of the type-2 data is likely to be much greater than that of the type-1 data.

[0152] Each of the type-1 packets must be transferred in accordance with a strict time schedule. The transfer of type-1 and type-2 data packet streams on a shared channel is enabled by the circuit shown in FIG. 17. The circuit 380 is required at each module for the channel connected to the global controller 162. The circuit 380 includes the payload packet buffer 382, the timing packet buffer 384, a payload packet transfer duration indicator 386, and an output 388 of time counter circuit 174 (FIG. 6). Buffer 382 stores type-2 packets and buffer 384 stores type-1 timing packets. The indicator 386 stores a value representative of the time required to transfer a type-2 packet stored in buffer 382, and the indicator 388 stores the output of the local time counter. If the local time counter is an up-counter, the output stored in the time counter output indicator is a 1s complement of the reading of the local time counter. The timing packet must be transmitted when the value stored in indicator 388 is zero. The time remaining before a timing packet has to be transferred is indicated by the counter output stored in indicator 388. When a type-2 packet has been transferred, a buffer selection is determined. If timing packet buffer 384 is empty, any packet stored in type-2 buffer 382 is permitted to egress. Otherwise, if the entry in payload packet duration indicator 386 is smaller than the entry in time counter output indicator 388, the type-2 packet is transferred, since the transfer of the type-2 packet will be complete before the time scheduled for transferring the type-1 packet. If the entry in the payload packet duration indicator 386 is larger than the entry in the time counter output indicator 388, data transfer is disabled, since the stored type-2 packet would not be completely transferred before the requested release time of the type-1 timing packet. When the time counter output indicator 388 reads exactly zero, and a timing packet is stored in buffer 384, the timing packet is transferred.
A comparator 392 compares the contents of the payload packet duration indicator 386 and time counter output 388 and produces a two-bit output Q. The output Q is “00” if a reading of the time counter output 388 is smaller than a reading of the payload packet duration indicator 386, “10” if the opposite is true, and “11” whenever the reading of the time counter output 388 is zero. The 2:1 selector 390 connects the outgoing channel to the type-1 packet buffer 384 if Q is “11”, or the type-2 packet buffer 382 if Q is “10”. Otherwise, the 2:1 selector 390 goes to an idle state. This circuit enables both payload and timing packets to be transferred via a channel used for the transfer of control messages and timing packets.
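
By way of illustration only, the selection logic described for circuit 380 may be sketched in software as follows. The sketch is not part of the disclosed circuit. The function names, the use of counter ticks as the common unit for both indicators, and the treatment of the tie case (transfer duration equal to the remaining time, which the description leaves open and which is treated here as idle) are assumptions made for the example.

```python
from typing import Optional

def remaining_time(up_count: int, bits: int) -> int:
    """1's complement of a C-bit up-counter reading (value held in indicator 388)."""
    return ~up_count & ((1 << bits) - 1)

def comparator_q(remaining: int, duration: int) -> str:
    """Two-bit output Q of comparator 392, returned as a string."""
    if remaining == 0:
        return "11"          # release time of the timing packet has arrived
    if duration < remaining:
        return "10"          # type-2 packet completes before the release time
    return "00"              # assumption: a tie is treated like "too long"

def select_buffer(timing_waiting: bool,
                  payload_duration: Optional[int],
                  remaining: int) -> str:
    """Decide which buffer the 2:1 selector 390 connects to the channel.

    payload_duration is None when no type-2 packet is waiting.
    Returns "type-1", "type-2" or "idle".
    """
    if not timing_waiting:
        # Buffer 384 empty: any waiting type-2 packet may egress.
        return "type-2" if payload_duration is not None else "idle"
    if payload_duration is None:
        return "type-1" if remaining == 0 else "idle"
    q = comparator_q(remaining, payload_duration)
    return {"11": "type-1", "10": "type-2", "00": "idle"}[q]
```

For example, with a 4-bit up-counter reading 10, `remaining_time(10, 4)` gives 5 ticks, so a waiting type-2 packet needing 3 ticks would be released, whereas one needing 7 ticks would hold the channel idle until the timing packet has been sent.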

[0153] It will be understood by those skilled in the art that the foregoing description is intended to be exemplary only. Changes and modifications to the described embodiments will no doubt become apparent to skilled persons. The scope of the invention is therefore intended to be limited solely by the scope of the appended claims.

[0154] The embodiment(s) of the invention described above is(are) intended to be exemplary only. The scope of the invention is therefore intended to be limited solely by the scope of the appended claims.

We claim:
 1. A self-configuring switch, comprising: a channel-switched core; a plurality of data switch modules connected to the channel-switched core; and a controller that dynamically configures the channel connections among data switch module pairs to adapt to spatial and temporal variations in data traffic loads.
 2. A self-configuring switch, comprising: a memoryless channel switch for performing channel switching according to a predetermined time schedule; a global controller, the global controller including a global time counter circuit and a channel assignment mechanism, the channel assignment mechanism computing a predetermined time schedule for the memoryless channel switch; and a plurality of data switch modules connected to local data traffic sources through incoming links and to local data traffic sinks through outgoing links, and connected to the channel switch by a plurality of outgoing channels and a plurality of incoming channels, each module having a local controller and a local time counter circuit, means for sorting data units into data groups corresponding to the respective channels, a memory for storing the sorted data groups, means for determining a volume of data in each data group, means for reporting each volume of data to the global controller, means for receiving control signals from the global controller, and means for timing the transfer of sorted data from the data groups in accordance with the control signals.
 3. The switch as claimed in claim 2 wherein the links connecting each module to the channel switch are wavelength multiplexed and the channel switch comprises a bank of optical demultiplexers, a bank of space switches, and a bank of optical multiplexers.
 4. An optical channel switch, comprising: an integer G>1 groups of incoming wavelength multiplexes each group including an integer n>1 of multiplexes, and each multiplex supporting an integer W>1 of optical wavelength channels; the integer G groups of wavelength demultiplexers, each demultiplexer being associated with an incoming multiplex; the integer G groups of optical space switches, each group including the integer W space switches, each space switch having n input and n output ports, the space switches being serially assigned numbers from 0 to (G×W−1), and a space switch assigned a number S being associated with wavelength number [S+G] modulo W; the integer G groups of wavelength multiplexers, each multiplexer being associated with an outgoing multiplex; and the integer G groups of outgoing multiplexes, each group including n multiplexes, and each multiplex having W channels; whereby the incoming multiplexes, the space switches, and the outgoing multiplexes are connected such that: an m-th input port of each n×n space switch S, the space switch being associated with a wavelength Λ, 0≦m<n, 0≦S<G×W, and 0≦Λ<W, is connected to a channel in the m-th incoming multiplex belonging to a group [S/G] modulo G, the channel corresponding to wavelength Λ; and the m-th output port of each n×n space switch S, 0≦m<n and 0≦S<G×W, being connected to a channel corresponding to wavelength Λ in the m-th outgoing multiplex belonging to an outgoing multiplex group, the group number being determined as the integer part of the ratio (S/W).
 5. The optical channel switch as claimed in claim 4 wherein the connection pattern between the incoming multiplexes and the space switches, and the connection pattern between the space switches and the outgoing multiplexes are reversed.
 6. A method of coordinating connection changes in a distributed switch comprising a plurality of data switching modules interconnected by a channel switch, the switching modules being spatially distributed such that propagation delay periods from the respective modules to the channel switch are generally unequal and a global controller associated with the channel switch has a negligible propagation delay due to proximity to the channel switch, the global controller being designated as a global time keeper, each of the modules having a cyclic time counter of a predetermined period D, and any one of paths connecting a given module to other modules may be reconfigured by the global controller so that internal connections within the given module must be changed accordingly, comprising steps of: broadcasting from the global controller a reconfiguration request message which includes a target time T at which connection reconfiguration at the channel switch will occur; receiving the broadcast target time at each module affected by the reconfiguration and performing necessary connection changes at a time selected so that the data transmitted by each module after the connection changes occur arrive at the channel switch at the target time T, and the data are routed according to the re-arranged paths at the channel switch.
 7. The method as claimed in claim 6 wherein the target time T is implicitly assigned a default value equal to the time counter period D and the value of T is not included in the reconfiguration request message.
 8. The method as claimed in claim 6 wherein the global controller broadcasts its reconfiguration request message which includes a target time T at an instant T1<T so that the difference between T and T1 does not exceed a period of the cyclic time counter in each of the modules.
 9. The method as claimed in claim 6 wherein the period of the cyclic time counter at each module exceeds a largest propagation delay between any module and the global controller plus a time allowance sufficient for any module to implement a connection reconfiguration.
 10. The method as claimed in claim 6 wherein the period of the cyclic time counter is a power of 2.
 11. The method as claimed in claim 6 wherein the propagation delay from each module to the channel switch is negligible and the global controller broadcasts reconfiguration request packets to all modules before the target time T, permitting a predetermined interval to implement the reconfiguration changes required at each affected module.
 12. The method as claimed in claim 6 wherein the reconfiguration timing is not critical and timing information is not exchanged between the global controller and the switching modules.
 13. A method of timing coordination in a distributed switch comprising a plurality of data switching modules interconnected by a channel switch, a global switch controller being collocated with the channel switch and having negligible transmission delay to and from the channel switch, the global controller including a global time counter, each module having a local time counter, propagation delays from the modules to the global controller being generally unequal, the method comprising steps of: sending a timing packet from each of the modules to the global controller when the local cyclic time counter of a respective module is zero; attaching a time indication of the global time counter to the timing packet upon receiving the timing packet at the global controller, the time indication corresponding to a global time counter indication on receipt of a first bit of the timing packet; placing the timing packets in a transmit queue at the global controller; sending the timing packets back to the respective modules; and resetting the local time counters of the respective modules according to information extracted from the respective timing packets.
 14. The method as claimed in claim 13 wherein the global time counter at the global controller is an up-counter of C bits and the local time counter at each module is a down-counter of C bits, and when a module receives a timing packet, the module extracts the time indication and resets its down-counter by resetting each of its C bits to “1” when the reading of its down-counter equals the time indication.
 15. The method as claimed in claim 13 wherein the time counters at both the global controller and the modules are up-counters each of C bits and when a module receives a timing packet, it extracts the time indication, determines a 1's complement of the time indication, and resets its up-counter to zero when its time counter reading equals the 1's complement.
 16. The method as claimed in claim 13 wherein the time counters at both the global controller and the modules are down-counters each of C bits and when a module receives a timing packet, it extracts the time indication, determines its 1's complement, and resets its down-counter to zero when its reading equals the 1's complement of the time indication.
 17. A mechanism for enabling an interleaving of delay-sensitive type-1 and delay-tolerant type-2 packets on a shared channel in a communication module having a local time counter of C bits, the module transmitting type-1 and type-2 data packet streams on a shared channel, the mechanism comprising: a type-1 packet buffer; a type-2 packet buffer; a register for storing a transfer duration of a packet waiting in the type-2 packet buffer, the transfer duration being represented by an integer X≧0; an integer value produced from the output of a time counter; a comparator for comparing the values of the time counter and the transfer duration and producing a two-bit output Q, the output Q being “00” if transfer duration is smaller than the integer value of the time counter, and “11” if the integer value of the time counter is zero, regardless of the value of the transfer duration; and a 2:1 selector for selecting a packet from the type-1 packet register if Q is “11”, the type-2 packet register if Q is “10”, and remaining in an idle state if Q is “00”.
 18. The mechanism as claimed in claim 17 wherein the time counter is an up-counter, the counter period is a power of 2, and the integer value is a 1's complement of a value of the up-counter.
 19. A method of reconfiguring a channel switch having a global controller, N incoming multiplexes and N outgoing multiplexes connecting N data switching modules, comprising: storing available outgoing and incoming capacity of each data switching module in a 2×N module-state matrix; identifying the source module, the sink module, and the required capacity of each connection request; communicating the identifiers of each connection request to the global controller; the global controller tentatively accepting the request if the required capacity does not exceed the available outgoing capacity of the source module or the available incoming capacity of the sink module; the global controller storing the identifiers of connection requests in a connection-request queue if the request is tentatively accepted; the global controller storing the identifiers of connection requests in a standby queue if the request is not accepted; periodically sorting the entries of the request queue and the entries of the standby queue, if any, in a descending order according to the capacity requirements of the queued requests; periodically performing a matching process to assign each request in the request queue to a direct path; if a direct path is not found, performing a double-matching process to assign each request that cannot be routed in a direct path to a loop path; placing the identifiers of a successfully assigned connection in a progress queue; retaining the identifiers of unassigned connection requests for a subsequent assignment cycle; if any request in the request queue is not assigned to a path, repeating the matching process using the standby queue instead of the connection request queue; modifying the entries in the 2×N module-state matrix following each of the assignment decisions above; and communicating the assignment results to modules.
 20. The method as claimed in claim 19 wherein the channel switch is an optical channel switch comprising: an integer G>1 groups of incoming wavelength multiplexes each group including an integer n>1 of multiplexes, and each multiplex supporting an integer W>1 of optical wavelength channels; the integer G groups of wavelength demultiplexers, each demultiplexer being associated with an incoming multiplex; the integer G groups of optical space switches, each group including the integer W space switches, each space switch having n input and n output ports, the space switches being serially assigned numbers from 0 to (G×W−1), and a space switch assigned a number S being associated with wavelength number [S+G] modulo W; the integer G groups of wavelength multiplexers, each multiplexer being associated with an outgoing multiplex; and the integer G groups of outgoing multiplexes, each group including n multiplexes, and each multiplex having W channels; wherein a matching process is performed using a first matrix of n columns and G×W rows each entry of which indicates a free capacity in an input channel of a space switch and a second matrix of n columns and G×W rows each entry of which indicates a free capacity in a respective output channel of a respective space switch, the outcome of the matching process being an identifier of an output channel corresponding to each input channel of a space switch, the identifier being stored in a matrix of n columns and G×W rows, each entry of which being log₂n wide, rounded up to a nearest integer.
 21. The method as claimed in claim 20 wherein the spatial matching process follows a predetermined search order for each entry in the first matrix.